Quantum Theory and Probabilities 



E.C.G. Sudarshan 
Department of Physics and Center for Particle Physics 
University of Texas, Austin, Texas 78712-10810 

Abstract 

It is often stated that quantum mechanics only makes statistical predictions and that a 
quantum state is described by the various probability distributions associated with it. Can 
we describe a quantum state completely in terms of probabilities and then use it to describe 
quantum dynamics? What is the origin of the probability distribution for a maximally 
specified quantum state? Is quantum mechanics 'local' or is there an essential nonlocality 
(nonseparability) inherent in quantum mechanics? These questions are discussed in this 
paper. The decay of an unstable quantum state and the time dependence of minimum 
uncertainty states for future times as well as past times are also discussed. 

1 Introduction: Classical Dynamics 

The standard form of classical dynamics involves first-order equations of motion of the phase 
point representing the system. For a collection of particles the phase point is specified by 6N 
variables, 3N position coordinates and 3N momenta. The equations of motion would then 
furnish the trajectory of this phase point. These are nonintersecting continuous curves (except 
at the points where the Hamiltonian is irregular). Given an initial phase point, the dynamics 
specifies the final point for any time. 'Adjacent' phase points often lead to adjacent end points. 
(Note that in a symplectic space like the phase space, there is no canonically defined 'distance'.) 

The 'atomic' propositions for a classical system are phase points; in a state specified by an 
atomic proposition all dynamical variables have definite values. All dynamical variables are 
compatible observables which can be simultaneously specified. [1] 

In many cases one may choose a non-extremal initial condition, given by a (nonnegative) 
phase-space density. Then for a Hamiltonian evolution this goes into a statistical state. More 
generally, the statistical state is a distribution on the phase space which maps dynamical vari- 
ables into numbers. Only distributions which are 'measures' in that they are real and non- 
negative everywhere are allowed. A simple alternate characterization is by the expectation 
value 

$$\chi(\lambda, \mu) = \langle \exp i(\lambda p + \mu q) \rangle \tag{1}$$

which is called the characteristic function. [2] It is bounded in magnitude by unity. 
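
As an illustration of the characteristic function and its bound, the following minimal sketch estimates the expectation value of exp i(λp + μq) by averaging over sampled phase points. The Gaussian ensemble, the sample size, and the function name `characteristic_function` are assumptions made only for this example.

```python
import cmath
import math
import random

def characteristic_function(samples, lam, mu):
    """Estimate <exp i(lam*p + mu*q)> as an average over (p, q) samples."""
    total = sum(cmath.exp(1j * (lam * p + mu * q)) for p, q in samples)
    return total / len(samples)

# Hypothetical ensemble (an assumption for illustration): independent
# unit-variance Gaussian distributions for p and q.
rng = random.Random(0)
samples = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(20000)]

chi_origin = characteristic_function(samples, 0.0, 0.0)  # normalization
chi_point = characteristic_function(samples, 0.7, -1.2)  # a generic argument
exact_point = math.exp(-0.5 * (0.7 ** 2 + 1.2 ** 2))     # Gaussian closed form
```

Since every term in the average is a phase factor of unit modulus, the estimate can never exceed unity in magnitude, whatever nonnegative density the samples are drawn from.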

In addition to the distribution nature of the initial condition, we have two other cases where 
it is useful to consider such statistical descriptions. One is for a nonlinear system which has 
very complicated trajectories which are unstable: for small changes in the initial condition the 
final state is vastly different. In these cases the trajectory description is not useful, [3] even 
though the trajectories do not cross each other (except for phase points which are singular). 
Another context is for stochastic dynamics which do not map points into points but only 
distributions into distributions. A natural origin for such a behavior is to have an open dynamical 
system. The stochastic behavior is classified into Markovian and non-Markovian processes [4]: 
in the Markovian case only the immediate past is sufficient to determine the future evolution; 
it has no 'memory'. A non-Markovian process can be 'lifted' to be a Markovian system with a 
larger phase space. [Example: Multiple scattering of a charged particle: transverse displacements 
alone do not give a Markovian process, but if we include the slopes of the trajectories 
also, we can get a Markovian process.] [5] 

One method of getting a stochastic process from a deterministic dynamics is by 'contraction', 
that is ignoring some degrees of freedom. In many cases we can 'lift' the stochastic process into 
a deterministic dynamics of an extended dynamical system. 

A stationary Markovian stochastic process possesses the semigroup structure: if $A(t)\rho$ is 
the map of the phase space distribution $\rho$ by the Markovian process, then 

$$A(t_1) A(t_2) \rho = A(t_1 + t_2) \rho , \qquad t_1 > 0 , \; t_2 > 0 . \tag{2}$$

The map of the distribution is an 'into' map corresponding to a contraction. The contractive 
semigroup has a dissipative part. The generator of the semigroup has an 'imaginary' part. 
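
The semigroup law and its contractive character can be checked on the simplest possible case. The sketch below uses a two-state Markov process with hypothetical jump rates `a` and `b` (an assumption standing in for a phase-space process); `markov_map(t)` plays the role of A(t).

```python
import math

def markov_map(t, a=1.0, b=0.5):
    """Column-stochastic matrix A(t) of a two-state Markov process.

    a is the (hypothetical) jump rate from state 0 to state 1, b the rate back.
    """
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (b - b * e) / s],
            [(a - a * e) / s, (a + b * e) / s]]

def compose(m1, m2):
    """Matrix product m1 m2 for 2x2 matrices stored as nested lists."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply_map(m, rho):
    """Image of the distribution rho = [rho0, rho1] under the map m."""
    return [m[i][0] * rho[0] + m[i][1] * rho[1] for i in range(2)]

def l1_distance(r1, r2):
    """Sum of absolute differences between two distributions."""
    return abs(r1[0] - r2[0]) + abs(r1[1] - r2[1])

t1, t2 = 0.3, 1.1
lhs = compose(markov_map(t1), markov_map(t2))
rhs = markov_map(t1 + t2)
semigroup_error = max(abs(lhs[i][j] - rhs[i][j])
                      for i in range(2) for j in range(2))

# Two different initial distributions approach each other: a contraction.
d_before = l1_distance([1.0, 0.0], [0.0, 1.0])
d_after = l1_distance(apply_map(rhs, [1.0, 0.0]), apply_map(rhs, [0.0, 1.0]))
```

The distance between any two distributions shrinks by the factor exp(-(a+b)t), which is the dissipative part of the generator at work; no inverse map with nonnegative entries exists for t > 0.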

It is often stated that for a deterministic dynamical evolution with very complicated trajec- 
tories, there is a loss of information in the forward direction and hence the evolution produces 
an arrow of time. [6] This 'fundamental resolution', first offered by Boltzmann and then refined 
by a succession of authors, raises a question: how can a reversible evolution lead, by itself and 
without approximation, to an irreversible process? But if we trace the evolution to the past 
from the present state, we find that there is again an apparent 'loss of information' to the 
past! In other words, an unstable trajectory which leads to the simple present state only 
appears to have lost information. So within a closed dynamical system with a reversible 
dynamics, an arrow of time cannot be discerned. This is essentially the Loschmidt objection to 
Boltzmann's assertion. The unstable, nonlinear system makes it less obvious but the Loschmidt 
objection still obtains. It is also important to note that 'reversibility' alone is sufficient: 'time 
reversal invariance' is not required. 

2 Quantum Dynamics: Characteristic Functions and 
Distributions 

Quantum systems are characterized by the superposition principle [7]: atomic (extremal) states 
can be obtained which are superpositions of two distinct states. This property automatically 
leads to noncommuting dynamical variables which cannot be simultaneously specified. For 
example, a wave function $\psi(x)$ yields a position probability distribution 

$$P(x) = |\psi(x)|^2 \tag{3}$$

so that 

$$f(x) \to \langle f(x) \rangle = \int P(x)\, f(x)\, dx . \tag{4}$$

The position probability distribution has less information than $\psi(x)$ (even $\psi(x)$ modulo an 
overall phase) since there is no information about the momentum; but given $\psi(x)$ we can get 
the momentum space function 

$$\tilde{\psi}(p) = (2\pi)^{-3/2} \int \psi(x)\, e^{-ipx}\, dx \tag{5}$$

from which the momentum-related dynamical variables have the expectation value 

$$g(p) \to \langle g(p) \rangle = \int \tilde{\psi}^*(p)\, \tilde{\psi}(p)\, g(p)\, dp . \tag{6}$$

But what about functions of both p and q? For the characteristic function 

$$\chi(\lambda, \mu) = \langle e^{i\lambda p + i\mu q} \rangle \tag{7}$$

we may make a canonical transformation 

$$q \to Q = \lambda p + \mu q , \qquad p \to P = \frac{-\mu p + \lambda q}{\lambda^2 + \mu^2} . \tag{8}$$

Then 

$$\chi(\lambda, \mu) = \langle e^{iQ} \rangle = \int \psi^*(Q)\, e^{iQ}\, \psi(Q)\, dQ . \tag{9}$$

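Since any bounded function of q and p can be expressed in terms of the unitary operators 
$\exp\{i(\lambda p + \mu q)\}$, their expectation values can be computed given the wave function. 

The characteristic function can be computed in terms of the wave function $\psi(q)$ as follows: 

$$\exp\{i(\lambda p + \mu q)\} = \exp\left(\frac{i\lambda p}{2}\right) \exp(i\mu q) \exp\left(\frac{i\lambda p}{2}\right) . \tag{10}$$

Hence 

$$\langle \exp\{i(\lambda p + \mu q)\} \rangle = \int \psi^*\!\left(q - \frac{\lambda}{2}\right) e^{i\mu q}\, \psi\!\left(q + \frac{\lambda}{2}\right) dq . \tag{11}$$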
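
The integral form of the characteristic function, with the wave function shifted by half of λ to either side and multiplied by exp(iμq), can be checked numerically. The sketch below assumes units with ħ = 1 and takes the oscillator ground state as a hypothetical test case, for which the closed form is exp(-(λ² + μ²)/4); the function names and grid parameters are illustrative choices.

```python
import cmath
import math

def psi_ground(q):
    """Oscillator ground state, hbar = m = omega = 1 (an assumed example)."""
    return math.pi ** -0.25 * math.exp(-0.5 * q * q)

def chi(psi, lam, mu, half_width=12.0, steps=4800):
    """Riemann-sum estimate of the integral of
    conj(psi(q - lam/2)) * exp(i*mu*q) * psi(q + lam/2) over q."""
    dq = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        q = -half_width + k * dq
        bra = complex(psi(q - lam / 2.0)).conjugate()
        total += bra * cmath.exp(1j * mu * q) * psi(q + lam / 2.0)
    return total * dq

chi_numeric = chi(psi_ground, 0.8, -0.5)
chi_exact = math.exp(-(0.8 ** 2 + 0.5 ** 2) / 4.0)
```

As in the classical case, the value at λ = μ = 0 is unity and the magnitude never exceeds unity; what differs is which functions χ are admissible.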

These calculations pertain to the atomic propositions corresponding to 'pure states.' For a 
statistical state we take mixtures of states with suitable probability weights according to 

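$$\rho_1 = \psi_1 \psi_1^\dagger , \qquad \rho_2 = \psi_2 \psi_2^\dagger$$

$$\rho = \cos^2\theta\, \rho_1 + \sin^2\theta\, \rho_2 \tag{12}$$

and more generally 

$$\rho = \sum_i p_i \rho_i , \qquad \sum_i p_i = 1 , \qquad p_i \ge 0 . \tag{13}$$

The characteristic functions obey the same law: 

$$\chi(\lambda, \mu) = \sum_i p_i\, \chi_i(\lambda, \mu) , \qquad \sum_i p_i = 1 , \qquad p_i \ge 0 . \tag{14}$$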

For classical dynamics, given the characteristic function $\chi(\lambda, \mu)$, the multivariate phase-space 
distribution function is given by the double Fourier transform (for N particles): 

$$\rho(p, q) = (2\pi)^{-6N} \iint e^{-i(\lambda p + \mu q)}\, \chi(\lambda, \mu)\, d\lambda\, d\mu . \tag{15}$$

By definition the phase space density $\rho(p, q)$ is nonnegative and integrates to unity: 

$$\rho(p, q) \ge 0 , \qquad \iint \rho(p, q)\, dp\, dq = 1 . \tag{16}$$



So $\rho$ is a proper probability measure. The same construction may be carried out for quantum 
characteristic functions: [8] 

$$W(p, q) = (2\pi)^{-6N} \iint e^{-i(\lambda p + \mu q)}\, \chi(\lambda, \mu)\, d\lambda\, d\mu . \tag{17}$$

The normalization condition is still valid: 

$$\iint W(p, q)\, dp\, dq = 1 . \tag{18}$$

But it is no longer nonnegative. W may become negative pointwise in the phase space. There 
is however a positivity condition: 

$$\langle f(p, q)^\dagger f(p, q) \rangle \ge 0 . \tag{19}$$
W is the Wigner phase space density. [9] It also satisfies the condition 

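$$\iint W^2(p, q)\, dp\, dq = (2\pi)^{-3N}\, \mathrm{tr}\, \rho^2 = (2\pi)^{-3N} \tag{20}$$

for a pure state and 

$$\iint W^2(p, q)\, dp\, dq = (2\pi)^{-3N}\, \mathrm{tr}\, \rho^2 < (2\pi)^{-3N} \tag{21}$$

for a mixed state. 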
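
These properties of the Wigner density can be seen in a small numerical sketch. It assumes a single degree of freedom with ħ = 1 (so the purity factor is 2π), and uses the first excited oscillator state as a hypothetical example. For a real wave function the Fourier transform of the characteristic function reduces to W(p, q) = (1/2π) ∫ ψ(q - y/2) ψ(q + y/2) cos(py) dy, which is what `wigner` evaluates; the grid sizes are illustrative.

```python
import math

def psi_first_excited(q):
    """n = 1 oscillator eigenfunction, hbar = m = omega = 1, real-valued."""
    return math.sqrt(2.0) * math.pi ** -0.25 * q * math.exp(-0.5 * q * q)

def wigner(psi, p, q, half_width=10.0, steps=200):
    """W(p, q) for a real wave function psi, by a Riemann sum over y."""
    dy = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * dy
        total += psi(q - y / 2.0) * psi(q + y / 2.0) * math.cos(p * y)
    return total * dy / (2.0 * math.pi)

def phase_space_moments(psi, half_width=5.0, steps=50):
    """Return (integral of W, integral of W^2) over a square phase-space grid."""
    h = 2.0 * half_width / steps
    norm = 0.0
    square = 0.0
    for i in range(steps + 1):
        p = -half_width + i * h
        for j in range(steps + 1):
            q = -half_width + j * h
            w = wigner(psi, p, q)
            norm += w
            square += w * w
    return norm * h * h, square * h * h

w_origin = wigner(psi_first_excited, 0.0, 0.0)        # negative at the origin
norm, square = phase_space_moments(psi_first_excited)
purity = 2.0 * math.pi * square                       # equals 1 for a pure state
```

The density integrates to unity and saturates the pure-state condition, yet at the origin it takes the value -1/π, so it cannot be read as a probability measure on phase space.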

Given the phase space density $\rho(p, q)$ or $W(p, q)$, the equations of motion can be transcribed 
in terms of these. For classical dynamics we have the Liouville equation 

$$\frac{\partial}{\partial t} \rho(q, p, t) = -i [H, \rho](q, p, t) , \tag{22}$$

while for quantum dynamics [10] 

$$\frac{\partial}{\partial t} W(q, p, t) = (M W)(q, p, t) \tag{23}$$
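where the Moyal operator M is defined as 

$$M \rho(p, q) = -2 H(p, q) \sin\left[ \frac{1}{2} \left( \overleftarrow{\partial_p}\, \overrightarrow{\partial_q} - \overleftarrow{\partial_q}\, \overrightarrow{\partial_p} \right) \right] \rho(p, q) , \tag{24}$$

the arrows indicating whether a derivative acts on H or on $\rho$. 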



This is a nonlocal operator (except for a polynomial Hamiltonian) but it preserves the purity 
of the state. In place of canonical variables we may have other variables which do not commute. 
A particularly important case is that of a spin system either by itself or in terms of spinning 
particles. We will discuss this later in this presentation. 



3 Distributions over Noncommuting Variables: Spectra 



If B, C are noncommuting operators for a quantum system, then with every dynamical variable 
we have a spectrum of possible 'eigenvalues'; and according to the von Neumann postulate, any 
measurement yields one of these eigenvalues, with the frequency of each eigenvalue depending on 
the state $\psi$ (or $\rho$) of the quantum system. Every state assigns for any dynamical variable B a 
probability distribution, and the expectation value of any function of B can be computed using 
this probability distribution. If B and C commute, then we can consider a spectrum for the pair 
(B, C); and so on. For example, for a spinning particle in an inhomogeneous magnetic field, we 
can specify both the momentum components and the spin projection along the gradient of the 
magnetic field. The first has a continuous spectrum and the second a discrete Stern-Gerlach 
spectrum. The measurement postulate [11] leaves out the definition of 'measurement'. For Bohr 
and Heisenberg this 'quantum jump' into an eigenvalue is instantaneous and indescribable; but 
Schrödinger insists that it is a process, however rapid the transition [12]. We will comment on 
this later. Given a quantum state and an operator B, we have a spectrum of values. What we 
can calculate is the probability distribution but not the actual value obtained in an experiment. 
In this respect, quantum probabilities are no different from classical probabilities: any particular 
'realization' is random. What we can compute or predict is the probability distribution. In 
classical physics we do not appeal to the 'collapse of a probability distribution' into a unique 
measured value: why should we then talk about the collapse of a quantum distribution? In 
both cases an immediate remeasurement yields the same definite value. The only difference 
between the classical and quantum distributions is that in the former we can have a probability 
distribution for all the dynamical variables, but in the latter we can specify at most a complete 
commuting set. In particular, for the classical case, since the equations of motion change a 
dynamical variable into a dynamical variable, we can have a complete history of the particle's 
time evolution. For a Hamiltonian system these are a set of continuous nonintersecting curves. 
For more general stochastic evolution the histories can repeatedly branch out into the future. 

4 Spin Distributions: Separability 

For a quantum system we must remember that not all dynamical variables can be simultaneously 
measured. At any one time a complete commuting set can be measured and this measurement 
yields a collection of spectral values. What about the evolution in time? At a later time this 
set can be measured; thus we can construct a set of histories for a given state; but for this 
to be consistent we must impose some consistency conditions. This becomes necessary since 
states can superpose: so instead of continued branching we have also interfering recombinations. 
If we can assign probabilities for all measurements for several times, we talk of a consistent 
history. [13] 

Any history in which there is no recombination is automatically consistent. The most 
general consistent history is when there is at most one loop on any set of histories (the $\pi$'s are 
projections): 

$$|\psi_i^\dagger\, \pi_1 \pi_2 \pi_3 \cdots \psi_j|^2 = P_{ij}(\pi_1 \pi_2 \pi_3 \cdots) . \tag{25}$$

If these are to be consistent, then 

$$P_{ij}(\pi_1 \pi_2 \cdots) = 0 , \qquad i \ne j . \tag{26}$$

If the $\pi$'s are one-dimensional projections, we can have the pairs of interfering amplitudes be 
out of phase by 90 degrees and hence provide no interference in probability. [14] 
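
The remark about amplitudes in quadrature is elementary arithmetic, and a short sketch makes it concrete. The two amplitudes 0.6 and 0.8 are hypothetical values chosen for illustration; `interference_term` is the difference between the probability of the recombined history and the sum of the separate probabilities.

```python
import cmath
import math

def interference_term(a1, a2):
    """|a1 + a2|^2 - |a1|^2 - |a2|^2, which equals 2 Re(conj(a1) * a2)."""
    return abs(a1 + a2) ** 2 - abs(a1) ** 2 - abs(a2) ** 2

a = 0.6  # hypothetical amplitude of the first branch
b = 0.8  # hypothetical amplitude of the second branch

in_phase = interference_term(a, b)
quadrature = interference_term(a, b * cmath.exp(1j * math.pi / 2.0))
```

When the relative phase is 90 degrees the cross term vanishes and the probabilities add as they would for classical alternatives; for any other relative phase the sum rule fails.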

While consistent histories require nonnegative probabilities, we can raise the question of 
the characteristic function providing a phase space distribution. For canonical variables p, q, 
this is given by the Wigner function. When we want to find the probability distribution for a 
commuting set, namely functions of 

$$X = \lambda p + \mu q - \nu \tag{27}$$

then we get a nonnegative probability distribution: 

$$P(\lambda, \mu, \nu) = \iint W(p, q)\, \delta(\lambda p + \mu q - \nu)\, dp\, dq . \tag{28}$$

This is a nonnegative distribution. $W(p, q)$ determines the quantum tomogram $P(\lambda, \mu, \nu)$; and 
if $P(\lambda, \mu, \nu)$ is known for all $\lambda, \mu, \nu$, then it completely determines $W(p, q)$. Note that the 
tomograms scale according to 

$$P(\lambda, \mu, \nu) = \frac{1}{|\nu_0|}\, P(\lambda/\nu_0 ,\, \mu/\nu_0 ,\, \nu/\nu_0) \tag{29}$$

so that it is really a function of only two variables. 
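
A numerical sketch may help to see how a Wigner function that is negative in places still yields nonnegative tomograms. It assumes one degree of freedom with ħ = 1 and uses the closed-form Wigner function of the first excited oscillator state as a hypothetical input; the delta function is handled by integrating along the line λp + μq = ν and dividing by the length of (λ, μ).

```python
import math

def wigner_first_excited(p, q):
    """Closed-form Wigner function of the n = 1 oscillator state (hbar = 1)."""
    r2 = p * p + q * q
    return (2.0 * r2 - 1.0) * math.exp(-r2) / math.pi

def tomogram(w, lam, mu, nu, half_width=8.0, steps=1600):
    """Integral of w(p, q) * delta(lam*p + mu*q - nu) over the phase plane."""
    norm = math.hypot(lam, mu)
    p0 = nu * lam / norm ** 2          # point on the line closest to the origin
    q0 = nu * mu / norm ** 2
    tp, tq = -mu / norm, lam / norm    # unit vector along the line
    dt = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        t = -half_width + k * dt
        total += w(p0 + t * tp, q0 + t * tq)
    return total * dt / norm

# Tomogram along one direction, sampled at several values of nu.
tomogram_values = [tomogram(wigner_first_excited, 1.0, 0.0, 0.25 * k)
                   for k in range(-12, 13)]

# Scaling check with an arbitrary (hypothetical) factor nu0 = 2.
direct = tomogram(wigner_first_excited, 1.5, -0.5, 0.7)
scaled = tomogram(wigner_first_excited, 0.75, -0.25, 0.35) / 2.0
```

Although the input is negative near the origin, each line integral is a genuine marginal probability; and rescaling all three arguments only changes the overall factor, which is why two variables suffice.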

In the more general case of any quantum system we could compute a 'master distribution 
function' from which the marginal distributions for any subset of variables can be computed by 
integrating over all the unwanted variables. This is particularly useful in computing probability 
distributions for spin projections along various directions. No two of these commute so that the 
master distribution is not positive definite. For example, given a spin-1/2 object the distribution 
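in a state $\rho$ for $s \cdot n_1$ and $s \cdot n_2$ is given by 

$$P(n_1, n_2) = \psi^\dagger\, \frac{1 + s \cdot n_1}{2}\, \frac{1 + s \cdot n_2}{2}\, \psi = \mathrm{tr}\big( \rho\, \pi(n_1)\, \pi(n_2) \big) . \tag{30}$$

This quantity is not real in general, but its marginal distributions are nonnegative: 

$$P(n_1) = \sum_{n_2 = \pm} P(n_1, n_2) = \mathrm{tr}\big( \rho\, \pi(n_1) \big) \ge 0 . \tag{31}$$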
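
A concrete instance of such a master distribution can be computed with 2×2 matrices. The sketch below assumes that s stands for the Pauli vector, so that (1 ± s·n)/2 is the projection onto spin up or down along n, and takes a state polarized along z with the two directions along x and y; these choices and the helper names are illustrative.

```python
def projector(n, sign=+1):
    """(1 + sign * s.n) / 2 with s the Pauli vector (assumed reading of s)."""
    nx, ny, nz = n
    return [[0.5 * (1 + sign * nz), 0.5 * sign * (nx - 1j * ny)],
            [0.5 * sign * (nx + 1j * ny), 0.5 * (1 - sign * nz)]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def joint(rho, n1, s1, n2, s2):
    """tr(rho pi(n1) pi(n2)) for the outcomes s1, s2 = +1 or -1."""
    return trace(matmul(rho, matmul(projector(n1, s1), projector(n2, s2))))

x, y, z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
rho = projector(z, +1)                  # pure state, spin up along z

p_xy = joint(rho, x, +1, y, +1)         # complex entry of the master distribution
marginal_x = sum(joint(rho, x, +1, y, s) for s in (+1, -1))
total = sum(joint(rho, x, s1, y, s2) for s1 in (+1, -1) for s2 in (+1, -1))
```

The joint entry for both outcomes positive comes out as (1 + i)/4, while summing over the second outcome returns the ordinary probability 1/2 for spin up along x.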
Instead of giving the state $\rho$, if we use the projection $\pi(n)$ for $\rho$ we get 

$$P(n, n_1, n_2) = \mathrm{tr}\big( \pi(n)\, \pi(n_1)\, \pi(n_2) \big) \tag{32}$$

which is in general complex but gives the correct marginal distributions. In place of $P(n, n_1, n_2)$ 
we could take the symmetrized quantity 

$$P_s(n, n_1, n_2) = \text{symmetrized } P(n, n_1, n_2) \tag{33}$$

which is real but not positive definite. It also gives the correct (symmetrized) marginal 
distributions. 

These observations give us the proper tools for dealing with Bell's inequalities. If we consider 
the master distributions for three directions $n_1, n_2, n_3$, then 

$$P(n_1, n_2) = \sum_{n_3 = \pm} P(n_1, n_2, n_3) \quad \text{etc.} \tag{34}$$

If $P(n_1, n_2, n_3)$ were a classical nonnegative probability distribution, then these two-point 
correlations would satisfy the triangle inequalities 

$$P(n_1, n_2) + P(n_1, n_3) \ge P(n_2, n_3) \quad \text{etc.} \tag{35}$$



However, quantum mechanics gives the unsymmetrized expression 

$$P(n_1, n_2, n_3) = \frac{1}{8}\, \mathrm{tr}\, \{ (1 + s \cdot n_1)(1 + s \cdot n_2)(1 + s \cdot n_3) \} \tag{36}$$

which on symmetrization yields 

$$\frac{1}{4} \left( 1 + \cos\theta_{12} + \cos\theta_{23} + \cos\theta_{31} \right) . \tag{37}$$

If we suitably choose $\theta_{12}, \theta_{23}, \theta_{31}$, we can easily violate the triangle inequalities and hence 
Bell's inequalities [16]. 
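
The indefiniteness of the symmetrized three-direction quantity is easy to exhibit. The sketch below assumes that s is the Pauli vector and that each factor is the projection (1 + s·n)/2, so the trace carries an overall 1/8; it takes three coplanar directions at 120 degrees to one another, a standard choice made here for illustration. Averaging the trace over orderings amounts to taking its real part, since reversing the order of Hermitian factors conjugates the trace.

```python
import math

def projector(n):
    """(1 + s.n) / 2 with s the Pauli vector (assumed reading of s)."""
    nx, ny, nz = n
    return [[0.5 * (1 + nz), 0.5 * (nx - 1j * ny)],
            [0.5 * (nx + 1j * ny), 0.5 * (1 - nz)]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def symmetrized_triple(n1, n2, n3):
    """Real part of tr(pi(n1) pi(n2) pi(n3)), the average over orderings."""
    m = matmul(projector(n1), matmul(projector(n2), projector(n3)))
    return (m[0][0] + m[1][1]).real

def direction(angle):
    """Unit vector in the x-z plane at the given angle from the z axis."""
    return (math.sin(angle), 0.0, math.cos(angle))

n1, n2, n3 = (direction(2.0 * math.pi * k / 3.0) for k in range(3))
value = symmetrized_triple(n1, n2, n3)
formula = 0.25 * (1.0 + 3.0 * math.cos(2.0 * math.pi / 3.0))
```

With all three mutual angles equal to 120 degrees the result is -1/8: the would-be joint probability is negative, so inequalities derived for nonnegative joint distributions need not hold.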

In the vast literature on Bell's inequalities many authors ascribe the violation of Bell's 
inequalities to the essential nonlocality (nonseparability) of quantum mechanics! But we see 
that if we accept indefinite master distributions, there is no need to invoke nonlocal properties, 
least of all nonlocal hidden variables. 

5 Does Reversible Dynamics Furnish an Arrow of Time? 

We now turn our attention to another fundamental problem: How to obtain irreversibility and 
an arrow of time from a closed dynamical system with time-reversible equations of motion? We 
have indicated already why the demonstrations of an arrow may be an unjustified conclusion. 
Note that any approximation in calculating a unitary operator would lead to nonunitarity and 
a dissipative evolution. But such a 'mistake' is not a demonstration of the arrow of time! If 
the system is open the forward evolution is made stochastic by keeping the interaction with 
the external system; since we use this only for the forward evolution but not the backward 
evolution, we have put in an arrow of time 'by hand'. 

6 Decaying States in Quantum Mechanics 

The most notable case of this kind is the decay of a metastable state (or an unstable particle). 
The quantum theory of spontaneous deexcitation of an excited atom was formulated by Dirac. 
[17] Under the influence of the perturbative radiation field coupling, the excited state goes into 
a superposition of the excited state and a continuum of photons plus the ground state. This is 
a unitary transformation and phase relations are preserved. If we develop the state backwards 
in time, we get to the pure excited state; and even further back a coherent combination of the 
excited state with the ground state. No irreversibility and no arrow of time obtains at this stage 
of computations. However, if we ask what is the probability of the survival of the excited 
state, then we get what appears to be a decay: but that is no more than the 'decay' of the 
magnitude of the square of the x-component of a vector rotating about the origin. The only thing 
to be specially invoked is the continuum of photon frequencies and the more or less monotonic 
decrease in the probability of the survival of the excited state. By deliberately ignoring the 
phase relations we obtain a probability distribution. It may be approximated by a dissipative 
stochastic process, dominated by an exponential decay; and it appears to provide an irreversible 
process. But if we had traced it back, we would get an antidissipative process. There is no 
arrow unless we deliberately ignore the negative time propagation. 
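
The symmetry of the survival probability under reversal of the time can be illustrated with a toy model. The sketch below is not Dirac's calculation: it simply assumes that the initial state is spread over a dense ladder of energy levels with Lorentzian weights of width `gamma` (the cutoff and level spacing are also assumptions), and evaluates the squared magnitude of the overlap between the initial state and the unitarily evolved state.

```python
import cmath
import math

def survival_probability(t, gamma=1.0, cutoff=200.0, step=0.01):
    """|<psi| exp(-iHt) |psi>|^2 for Lorentzian weights on a dense spectrum.

    gamma, cutoff and step are hypothetical model parameters.
    """
    n = int(round(cutoff / step))
    amplitude = 0j
    weight_sum = 0.0
    for k in range(-n, n + 1):
        omega = k * step
        weight = 1.0 / (omega * omega + (gamma / 2.0) ** 2)
        amplitude += weight * cmath.exp(-1j * omega * t)
        weight_sum += weight
    return abs(amplitude / weight_sum) ** 2

p_zero = survival_probability(0.0)
p_future = survival_probability(1.0)
p_past = survival_probability(-1.0)
p_later = survival_probability(2.0)
```

The evolution of each component is a pure rotation of phase, yet the survival probability falls off roughly as exp(-gamma |t|), equally toward the future and toward the past.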

7 Analytic Continuation of the Resolvent: Dual Space 
Formalism 

The problem can be treated in a more precise fashion by looking for the exact time development 
using the resolvent of the total Hamiltonian. This resolvent can be analytically continued into 
the complex plane without changing the time development [18]. The complex spectrum of the 
resolvent may be identified with the analytic continuations of the self-dual Hilbert space into 
a pair of dual vector spaces. Depending on whether we are interested in the forward time 
development or the backward time development, the analytically continued spectrum is most 
conveniently chosen to be in the lower or the upper half plane. Both evolutions are reversible; 
but if we make approximations, we get semigroups which are not automatically reversible. The 
irreversibility is not in the system but in the mutilation of the time development. 

8 Concluding Remarks: Quantum Radiative Transfer 

This reversibility paradox is well illustrated by the time development of a minimum uncertainty 
wave packet. It appears that the wave packet spreads in the future but it also spreads in the 
past. The increase in width depends on the elapsed time squared. The propagation of partially 
coherent light may be useful to illustrate the difference between the unitary evolution of the 
amplitude and the 'irreversible' spreading of the intensity. Classical radiative transfer uses 
the spectral intensity distribution. The classical radiative transfer theory[19] is not sufficient 
when we have interference, like passage of light through a double slit or through a diffraction 
grating. We do not have consistent histories since interference dominates. So the language 
used in 'delayed choice' and other gedanken experiments is somewhat inappropriate. But we 
can have a generalized radiative transfer formulation in which indefinite spectral and angular 
distributions are used [20]. The Wolf function, which is the analogue of the Wigner function 
generalized to a field, would then be the object of the generalized radiative transfer and it could 
treat all light propagation. If we include polarization also, we need 2×2 polarization matrices in 
place of specific intensities with angular dependence. We hope to present this in a subsequent 
paper. 
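
Returning to the minimum uncertainty wave packet mentioned at the start of this section, the symmetric spreading can be made explicit with the standard result for a free Gaussian packet, whose variance is the initial variance plus the square of ħt/(2mσ₀). The sketch below assumes a free particle; the default values of `sigma0`, `mass` and `hbar` are illustrative.

```python
import math

def packet_width(t, sigma0=1.0, mass=1.0, hbar=1.0):
    """Position width of a free Gaussian packet of minimum uncertainty at t = 0."""
    spread = hbar * t / (2.0 * mass * sigma0)
    return math.sqrt(sigma0 ** 2 + spread ** 2)

width_now = packet_width(0.0)
width_future = packet_width(3.0)
width_past = packet_width(-3.0)

# The excess variance grows as the square of the elapsed time.
excess_t = packet_width(1.0) ** 2 - width_now ** 2
excess_2t = packet_width(2.0) ** 2 - width_now ** 2
```

The width is the same at equal times before and after the instant of minimum uncertainty, so the spreading by itself singles out no direction of time.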